Field of the Invention

The present invention relates to visualization systems and methods and more
particularly to systems and methods for enabling visualization of data sets containing
large numbers of objects.

Description of the Related Art 

U.S. Patent No. 5,966,139, granted to Anupam et al. and assigned to Lucent
Technologies Inc., relates to scalable data segmentation and visualization. In
particular, the '139 patent describes visualization of relatively large amounts of data in
a limited display space, including segmentation of data, mapping of segments to a
node within a display space based upon a specified nodal layout, texture mapping each
graphical display to its node, and displaying the data at each node. The visualization
scheme enables a user to map a relation to a specification of an n-dimensional display
by designating how attributes are to be used.



Background 



However, the visualization system of the '139 patent fails to represent data sets
having two properties, namely that subsets of items in a data set relate to each other,
and that the relationships between items have an affinity value associated with them.
Furthermore, a system according to the '139 patent does not simplify the display of
data to the user by presenting multiple screens, each holding only the amount of data
that can comfortably fit on a screen.



Summary of the Invention 
The present invention discloses methods and systems for the visualization of
data sets containing a large number of interrelated objects (or items), where the data
sets have two properties, namely that subsets of items in a data set relate to each other,
and that the relationships between items have a value associated with them. In
particular, according to an embodiment of the present invention, a compact, easily
understood, and easily navigable visual representation of objects in a data set can be
achieved.

As a preparatory step, local rankings of the relationships between items are
established by ranking the items i that relate to each item j, and ranking all items k to
which item j relates, thereby ranking the affinity to each item j of item sets i and k.
Next, according to an embodiment of the present invention, a visualization can be
generated by presenting results separately for each item in a predetermined data set
and adjusting the presentation to avoid information overlap and overload. Separate
representation of each item of the data set can be accomplished by generating an
affinity chart for each item j in the data set, to display items closely related to selected
item j, with item j being placed prominently in the affinity chart, and placing items
which are more strongly related to item j closer to item j. Further, closeness is
expressed along curves or shaped segments which may be completely or partially
straight, and which are connected or which emanate from item j's position.

According to one embodiment of the present invention, continuous curves,
including but not limited to spiral segments, are employed to connect items relating to
item j at different intensity levels. To adjust the visualization to avoid information
overlap and overload, the items related to a particular item j are grouped by strength of
affinity. Each related item is individually spaced on the affinity chart, with each item
being placed in a non-overlapping position. Items with large numbers of related items
are presented with multiple affinity charts. In the case of multiple affinity charts, a
first affinity chart visualizes a set of most strongly related items. Next or subsequent
related affinity charts visualize less strongly related items. According to an
embodiment, curves can be used to represent the relationship of items related to a
particular item positioned at a starting point for the curve. Distance along the affinity
curve represents the strength of the affinity to the item at the starting point of the
curve. Color and shading gradations and curve thickness gradations are selectively
employed to emphasize the curve's role in conveying affinity strength. Items are
placed so they do not overlap or crowd each other. Arbitrarily large data sets are
visualized using low and localized computational resources.

Brief Description of the Drawings



Fig. 1 is an affinity chart according to an embodiment of the present invention.

Fig. 2 is an affinity chart of other related items for the affinity chart of Fig. 1.

Fig. 3 is a flowchart illustrating one method of visualizing large interrelated
data sets.

Fig. 4 is a data flow diagram showing the flow of data through a system for
visualizing large interrelated data sets.

Fig. 5 depicts a system for visualizing large interrelated data sets.

Fig. 6 illustrates a type of database structure that can be used as a source for
the rankings used by a system visualizing large interrelated data sets.

Fig. 7 is another affinity chart, according to the present invention.

Fig. 8 is another affinity chart of other related items for the affinity chart of
Fig. 7.

Detailed Description

Visualization of a data set refers to the use of various techniques to convey the
overall structure of information by visual means. In particular, visual cues can be
used to represent relations between objects. Visual cues can include, for example,
using segments of a curve to represent the affinity (or strength) of the relationship
between objects, and using gradations of the width and color of the curves to represent
the intensity of the affinity relationships between the objects.

Referring now to Fig. 1, there is shown a diagram of most closely related items
according to the present invention. In particular, Fig. 1 shows an affinity chart 129
including first and second affinity curves 130 and 140 including a principal item 131
and first and second pluralities of related items 132 and 142. Each related item 132,
142 includes a navigational link 133, 143 respectively and a search link 134, 144
respectively. Adjacent to the principal item 131 and at one end of the selected strings of
related items 132 and 142 are respective first sequence elements 135 and 145, and
adjacent to a last item of the selected strings of related items 132, 142 is a second
sequence element 136, 146 which provides a link to a supplementary affinity chart for
more remotely relevant strings of related items 132, 142.

In an embodiment, the affinity chart 129 may consist of a single list of textual
or graphical items and associated links. The principal item 131, related items 132 and
142, first sequence elements 135, 145, and second sequence elements 136, 146 may
all appear as items of the list. In this list, each related item may appear associated
with a navigational link 133, 143, and a search link 134, 144. Such an affinity chart
may be required when the display in which it is presented only accommodates lists.

Referring now to Fig. 2, there is shown a diagram of items less closely related
to principal item 131 than those shown in Fig. 1, according to the present invention.
In particular, Fig. 2 shows an affinity chart 249 representing those items that would
have been reached as a result of selecting first sequence element 135 in Fig. 1.
Affinity chart 249 in Fig. 2 includes first and second affinity curves 250 and 260
including a principal item 131 and first and second pluralities of related items 252 and
262 that are less closely related to principal item 131 than related items 132 and 142
shown in Fig. 1. Each related item 252, 262 includes a navigational link 253, 263
respectively and a search link 254, 264 respectively. Adjacent to the principal item
131 and at one end of the selected strings of related items 252 and 262 are respective
first sequence elements 255 and 265, which provide a link to the affinity chart of more
strongly related items, and adjacent to a last item of the selected strings of related
items 252, 262 is a second sequence element 256, 266 which provides a link to a
supplementary affinity chart for more remotely relevant strings of related items 252,
262. By selecting the first sequence element 255 or 265, the user can navigate to
reach the affinity curve 140 or 250 to view a string of relevant items as shown in Fig. 1.



Fig. 3 is a flow chart depicting a method for visualization of large interrelated
data sets, according to the present invention. This method enables the
visualization of data sets containing a large number of items, where the data sets have two
properties, namely that subsets of items in a data set relate to each other, and that the
relationships between items have a value associated with them.

In an embodiment of the invention, a method of visualization of large
interrelated data sets can include an information structuring phase 305, a chart layout
phase 310, and an information linking phase 315. In information structuring phase
305, the relationships between objects in a data set and the intensity of those
relationships can be computed. As a preparatory step in the information structuring
phase, an item from the data set can be selected in step 320. Local rankings of the
relationships between items can be established in step 324, by ranking for each
selected item j the items i that relate to that item j, and then ranking all items k to
which item j relates, thereby ranking the affinity for each item j to item sets i and k.
As a result, the rankings of the related items are relative (or local) to item j, but are not
a universal measurement of the importance of the item. To determine how an item
relates to another item, the strength of the relationship between the items can be
computed using objective criteria, subjective criteria, or a combination
of both. A value can be associated with each criterion, and a plurality of these values
can be reduced to a single value (i.e. an affinity value), by, for example, adding them
all together and then normalizing the value.
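
By way of illustration only, the reduction of several criterion values to a single
affinity value, and the local ranking of item sets i and k for a selected item j, might
be sketched as follows. The function names, the directed-pair dictionary, and the
choice to normalize by dividing the sum by its largest possible value are assumptions
made for this sketch; the specification does not prescribe a particular normalization.

```python
def affinity_value(criterion_values, max_per_criterion=1.0):
    """Reduce per-criterion scores to one affinity value in [0, 1].

    Assumption: every score lies in [0, max_per_criterion], and "normalizing"
    means dividing the sum by the largest sum that could occur.
    """
    if not criterion_values:
        return 0.0
    total = sum(criterion_values)
    return total / (max_per_criterion * len(criterion_values))


def local_rankings(affinities, j):
    """Rank item set i (items relating to j) and item set k (items j relates to).

    `affinities` maps a directed pair (source, target) to an affinity value.
    Both returned lists are ordered from strongest to weakest affinity and are
    local to j; they say nothing about the global importance of any item.
    """
    set_i = [(src, a) for (src, dst), a in affinities.items() if dst == j]
    set_k = [(dst, a) for (src, dst), a in affinities.items() if src == j]

    def strongest_first(pair):
        return -pair[1]

    return sorted(set_i, key=strongest_first), sorted(set_k, key=strongest_first)
```

Keeping the two rankings separate mirrors the distinction drawn above between items
that relate to item j and items to which item j relates.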



Referring back to Fig. 1, principal node 131 ("The Beatles") would have item
set i and item set k related to it. Spiral curve 130 depicts item set i (i.e. those items
that are related to principal node 131), while spiral curve 140 depicts item set k (i.e.
those items to which principal node 131 relates). To determine both item set i and
item set k in Fig. 1, both objective and subjective criteria related to The Beatles could
be used. Objective criteria used to determine the relationships between various
musical bands could include, for example, the era in which the band played (e.g.
1960s), and the genre of the music (e.g. rock, British Invasion, pop). Subjective
criteria could include, for example, how well liked the band is based on feedback from
users, and how often two bands appear together in radio station play lists.

In an embodiment, a single affinity value can be determined from the various
criteria to represent the strength of the relationship between objects, and the items
related to the selected item can be ranked in step 324 by the affinity values associated
with each related item. Once ranked, those related items can be clustered in step 328.
Clustering refers to the process of appropriately grouping the ranked objects. For
example, an arbitrary number (e.g. ten) of the most closely related items can be chosen
as a cluster. Once clustered, the number of affinity charts can be computed in step
332. Thus, using the above example, if twenty-eight related items exist, step 332 can
result in a computation of three affinity charts needing to be generated, two of which
would have ten items and the third having eight items. The first ten would be the
most closely related to the principal item, the next ten would be the next most closely
related, and the last eight would be the least closely related.
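
The twenty-eight item example can be worked through in a few lines. The following is
a minimal sketch only; the function name and the default cluster size of ten are
assumptions chosen to match the example rather than requirements of the method.

```python
import math


def cluster_into_charts(ranked_items, per_chart=10):
    """Split items, already ranked strongest first, into one cluster per chart."""
    n_charts = math.ceil(len(ranked_items) / per_chart)  # step 332
    return [
        ranked_items[c * per_chart:(c + 1) * per_chart]  # step 328
        for c in range(n_charts)
    ]


# Twenty-eight related items yield three charts of ten, ten and eight items.
ranked = [f"item-{n}" for n in range(1, 29)]
charts = cluster_into_charts(ranked)
print([len(chart) for chart in charts])
```

Because the input is already ranked, the first cluster holds the most closely related
items and each later cluster holds progressively weaker ones.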

Next in an embodiment of the invention, chart layout phase 310 can cause the 
organization and placement of the relevant subsets of the objects in charts or graphs 
that can be used to display the relationships between the objects. The complete set of 
charts constitutes a virtual map of the data set. 

Chart layout phase 310 can begin by positioning the selected item as a
principal node in a chart in step 338. This can include the placement of a hyperlink
for the selected item in a specified position of the chart. Next, a visualization can be
generated by presenting results separately for each item in a predetermined data set
and adjusting the presentation to avoid information overlap and overload. In an
embodiment, separate presentation for each item of the data set can be accomplished
by generating an affinity curve in step 342 for each item j in the data set, to display
items closely related to selected item j, with item j being placed prominently in the
affinity chart, and placing items which are more strongly related to j closer to j.
Further, closeness can be expressed along curves or shaped segments which may be
completely or partially straight, and which may be connected or may emanate from j's
position.

According to an embodiment of the present invention, continuous curves,
including but not limited to spiral segments, can be employed to connect items
relating to j at different intensity levels. In step 346, a related item can be selected and
in step 350, the size required for that related item can be determined. In step 354, the
related item can be individually spaced on the affinity chart by its rank, with each item
being placed in a non-overlapping position by allowing sufficient vertical and
horizontal displacement in step 358. A determination can be made in step 360 of
whether any more related items need to be placed on the affinity curve. If so, control
returns to step 346 where the next item is selected. If there are no more related items
to process, the color and size gradients of the curve can be adjusted in step 362 to
emphasize the affinity between the items.
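
The placement loop of steps 346 through 360 can be sketched as follows, by way of
example only. The sketch assumes an Archimedean spiral centred on the principal item,
axis-aligned bounding boxes for the items, and hypothetical parameter values; none of
these choices is mandated by the description above, which permits any curve or shaped
segment.

```python
import math


def overlaps(a, b):
    """True if two boxes (x, y, width, height), centred on (x, y), intersect."""
    return (abs(a[0] - b[0]) * 2 < a[2] + b[2]) and (abs(a[1] - b[1]) * 2 < a[3] + b[3])


def place_on_spiral(sizes, principal_size=(80.0, 24.0),
                    start_radius=40.0, growth=12.0, step=0.05):
    """Place related items along a spiral, strongest first.

    `sizes` lists the (width, height) required by each related item in rank
    order (step 350). The principal item sits at the origin. Returns one
    (x, y, width, height) box per related item.
    """
    occupied = [(0.0, 0.0, principal_size[0], principal_size[1])]
    placed = []
    theta = 0.0
    for width, height in sizes:                      # step 346: next item
        while True:
            radius = start_radius + growth * theta   # farther out = weaker
            box = (radius * math.cos(theta), radius * math.sin(theta),
                   width, height)
            if not any(overlaps(box, other) for other in occupied):
                break                                # step 358: enough room
            theta += step                            # displace along the curve
        occupied.append(box)
        placed.append(box)                           # step 354: item spaced
        theta += step
    return placed
```

Since the angle only ever increases and the radius grows with the angle, an item of
lower rank always lands farther along the curve than every item ranked above it.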

A selected item with a large number of related items can be presented with
multiple affinity charts. In the case of multiple affinity charts, a first affinity chart can
be used to visualize a set of most strongly related items. Next or subsequent related
affinity charts can be used to visualize less strongly related items.

According to an embodiment, arbitrarily large data sets can be visualized using
low and local computational resources. During information linking phase 315, the
sequence of affinity charts for a selected item can be hyperlinked in step 366. Each
related item can then be linked to its own chart in step 370. Once the selected item
and its associated affinity curves from the data set have been hyperlinked, navigation
can occur by the user clicking to connect to a selected related affinity
chart. Further, as a result of information linking phase 315, each item may have
separate features that can be activated by the user clicking on those features.
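
As a non-limiting sketch of information linking phase 315, the sequence of charts for
a selected item can be chained with links to the stronger and weaker neighbouring
charts, and each related item can be given a link to its own first chart. The URL
scheme and the dictionary layout below are hypothetical and serve only to make the
linking explicit.

```python
def chart_url(item_id, chart_number):
    """Hypothetical URL scheme: one page per (item, chart number) pair."""
    return f"/chart/{item_id}/{chart_number}"


def link_charts(item_id, clusters):
    """Hyperlink the chart sequence of one selected item (steps 366 and 370).

    `clusters` is the list of related-item clusters, strongest cluster first.
    """
    charts = []
    last = len(clusters) - 1
    for n, cluster in enumerate(clusters):
        charts.append({
            "url": chart_url(item_id, n),
            "principal": item_id,
            # Step 370: every related item links to its own first chart.
            "related": [{"item": r, "navigate": chart_url(r, 0)} for r in cluster],
            # Step 366: sequence elements link neighbouring charts.
            "stronger": chart_url(item_id, n - 1) if n > 0 else None,
            "weaker": chart_url(item_id, n + 1) if n < last else None,
        })
    return charts
```

The first chart has no stronger neighbour and the last chart has no weaker one, which
corresponds to the ends of the chart sequence.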

Upon completion of information linking phase 315, a determination can be
made in step 381 of whether more charts need to be generated for the selected item. If
so, control can pass to step 338 where the selected item can be placed in a new chart.
If no further charts are needed for the selected item, a determination can be made in step
386 of whether more items in the data set need charts generated for them. If so,
control can pass to step 320 where a determination is made of a new selected item and
the process repeats.

Fig. 4 is a data flow diagram showing the flow of data through a system for
visualizing interrelated data sets, according to an embodiment of the present
invention. In the system shown in Fig. 4, a user 405 of a computer 412 containing a
client browser 410 (as is well known and understood in the art) can send a request 415
via a distributed computer network (such as the Internet) to a web server 420. Web
server 420 can process the request and ultimately provide the interface for user 405,
including the ability to navigate visualized data sets.

Upon receiving the request, web server 420 can process the request and send
the request on to chart server 425. A determination can then be made in step 430 of
whether the particular chart corresponding to the request had previously been cached.
This could have occurred if a prior request had been made for the same data. If so, the
necessary information can be provided from chart cache 435. By way of example
only, the information could consist of charts and hyperlinked data 440.

If the particular chart corresponding to the request had not previously been
cached, the request can be passed on to affinity server 445. Affinity server 445 can
have available to it object affinity tables 450 that contain information regarding the
items in the data set, including, for example, the fields detailed in the affinity tables
shown in Fig. 6. After receiving a request, affinity server 445 can access object
affinity tables 450 for the information needed to structure the chart corresponding to
the request, according to information structuring phase 305 described with respect to
Fig. 3.

After being retrieved from chart cache 435 or generated by affinity server 445,
the charts and hyperlinked data 440 can be passed to chart server 425 where the
visualized data set can be graphically laid out and linked together. Upon completion,
chart server 425 can pass the graphically laid out and linked visualized data set to web
server 420, which, in turn, can pass it on to client browser 410 being used by user 405.
As a result, the visualized data set can be displayed to user 405.
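
The cache decision of step 430 can be illustrated with the following sketch. The class
and method names are hypothetical, the affinity server is reduced to a callable, and
storing a freshly generated chart in the cache is an assumption drawn from the remark
that a chart may be cached as a result of a prior request.

```python
class ChartServer:
    """Minimal stand-in for a chart server backed by a chart cache."""

    def __init__(self, generate_chart):
        self._generate = generate_chart  # stands in for the affinity server
        self._cache = {}                 # stands in for the chart cache
        self.cache_hits = 0

    def get_chart(self, request_key):
        # Step 430: has this chart been cached by a prior request?
        if request_key in self._cache:
            self.cache_hits += 1
            return self._cache[request_key]
        # Not cached: have the chart data generated, then remember it.
        chart = self._generate(request_key)
        self._cache[request_key] = chart
        return chart


def fake_affinity_server(request_key):
    """Placeholder generator; a real one would consult the affinity tables."""
    return {"chart_for": request_key}


server = ChartServer(fake_affinity_server)
first = server.get_chart("The Beatles")   # generated
second = server.get_chart("The Beatles")  # served from the cache
```

Only the first request for a given item reaches the generator; later requests for the
same item are answered from the cache.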

Fig. 5 depicts an embodiment of a system according to the present invention
for visualizing large interrelated data sets. As shown in Fig. 5, web server 505 can
receive image request 507 for a particular artist. Upon determining that image request
507 is to generate a visualization of data from a data set, that request can be passed to
chart server 510. Chart server 510 can then determine whether the necessary
information to fulfill the request (i.e. the affinity chart) has already been calculated
and cached. If so, image map 509 can be retrieved by chart server 510 from cache 515
and returned to web server 505 for display to the user. The retrieval of the bit map
can then cause bitmap request 511, which can then be followed by the retrieval of the
associated bitmaps from cache 515 and the display of bitmap 513 to the user.

If the information needed to fulfill the request has not been cached, chart
server 510 can fulfill the request by retrieving information about the requested item
from database 550. In particular, database 550 can contain data objects 555 that
contain a distinct affinity value (A) between the requested item (O1) and each related
item (O2). In response to a query by chart server 510, one or more data objects 555
for the requested item can be retrieved from database 550 and utilized by affinity
server 520 to generate the necessary information (including, for example, image maps
and bitmaps) for the selected item. Affinity server 520 can utilize an affinity ranking
means 540 to rank the related items by their affinity values, and an image layout
program 545 and image layout interpreter 535 to assemble the images. Affinity server
520 can also utilize image map generator 525 and bitmap renderer 530 to generate the
actual image map and bitmaps. Once generated, the image map and bitmaps can be
placed in cache 515 and transferred to web server 505. Web server 505 can then
present to the user the visualization of the interrelated data associated with the
requested item.

Fig. 6 illustrates a type of database structure that can be utilized in the process
of visualizing large interrelated data sets, according to an embodiment of the present
invention. Within a particular data set, each item can be identified within a table 605
in a database by an object identifier (OID) 610 and an object name 615 (that may also
contain object properties). The OID associated with each item is a primary key (PK)
620. Two OIDs from table 605 serve as primary key 620 for the affinity strength table
625. These two OIDs are labeled as foreign keys (e.g. FK1 and FK2 in field 640) of
affinity strength table 625, the second table within the database. Table 625 can be accessed
by fixing the OID values to obtain the affinity value 635 between two items identified
by their OIDs (e.g. OID #1 and OID #2 in field 630).
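
One possible realization of the two tables, given purely as an illustrative sketch, uses
an in-memory SQLite database. The table and column names, the placeholder band names
other than "The Beatles", and the affinity values are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item (                 -- item table: OID is the primary key
    oid  INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE affinity_strength (    -- affinity strength table
    oid1     INTEGER NOT NULL REFERENCES item(oid),  -- FK1
    oid2     INTEGER NOT NULL REFERENCES item(oid),  -- FK2
    affinity REAL    NOT NULL,
    PRIMARY KEY (oid1, oid2)        -- the two OIDs together form the key
);
""")
conn.executemany("INSERT INTO item VALUES (?, ?)",
                 [(1, "The Beatles"), (2, "Band B"), (3, "Band C")])
conn.executemany("INSERT INTO affinity_strength VALUES (?, ?, ?)",
                 [(1, 2, 0.8), (1, 3, 0.3)])


def affinity_between(conn, oid1, oid2):
    """Fix both OIDs and read the affinity value, or None if no row exists."""
    row = conn.execute(
        "SELECT affinity FROM affinity_strength WHERE oid1 = ? AND oid2 = ?",
        (oid1, oid2)).fetchone()
    return row[0] if row else None


def related_by_strength(conn, oid):
    """Names and affinity values of items related to `oid`, strongest first."""
    return conn.execute(
        "SELECT i.name, a.affinity FROM affinity_strength AS a "
        "JOIN item AS i ON i.oid = a.oid2 "
        "WHERE a.oid1 = ? ORDER BY a.affinity DESC", (oid,)).fetchall()
```

Because the key is the ordered pair of OIDs, the affinity from one item to another can
differ from the affinity in the reverse direction, or be absent altogether.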

In an alternative embodiment, the information about the related items within
the data set can be stored as an XML document with a document type definition
(DTD) rather than in database tables. One example of a DTD for such an XML
document could be:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE affinityChart [
<!ELEMENT SIMILAR (ITEM,(LIST)*)>
<!ELEMENT LIST (ITEM)+>
<!ATTLIST LIST arm (upper|lower) "upper">
<!ELEMENT ITEM (SEARCH,NAVIGATION)>
<!ATTLIST ITEM type (principal|related) "related">
<!ELEMENT SEARCH (#PCDATA)>
<!ELEMENT NAVIGATION (#PCDATA)>
]>

Referring now to Fig. 7, there is shown a visualization of a data set of
interrelated items, and further showing the most closely related items to a principal
item 711, according to an embodiment of the present invention. In particular, Fig. 7
shows an affinity chart 709 including an affinity curve 710 having a principal item
711 and a plurality of related items 712. Each related item 712 can include a
navigational link 713 and a search link 714. Each navigational link 713 can permit a
user to generate a new visualization of a data set of interrelated items related to its
associated related item 712. Each search link 714 can permit a user to produce a page
that has search results or additional information about the selected item. Adjacent to
the principal item 711 and at one end of a selected string of related items 712 is a first
sequence element 715, and adjacent to a last item of the selected string of related
items 712 is a second sequence element 716, each of which provides a link to a
supplementary affinity chart for either more or less remotely relevant strings of related
items 712. In Fig. 7, principal item 711 is shown as "union carbide productions".
This item represents a musical group. Related musical groups having varying levels
of affinity to "union carbide productions" are shown as well. For example, while the
related item 712 representing the musical entity "Sarah McLachlan" has an affinity to
principal item 711, other related items (such as related item 720 representing the
musical entity "the Cure" and related item 725 representing "R.E.M.") have a stronger
affinity, as indicated by their closer proximity to principal item 711 on the curve.
Similarly, still other related items may have a weaker affinity to principal item 711, as
indicated by their greater distance from principal item 711 on the curve.

Referring now to Fig. 8, there is shown a diagram of other related items to the
principal item in Fig. 7, according to the present invention. In particular, Fig. 8 shows
an affinity chart 819 representing those items that would have been reached as a result
of selecting first sequence element 715 in Fig. 7. Affinity chart 819 includes an
affinity curve 820 including a principal item 711 and a
plurality of related items 822. Each related item 822 includes a navigational link 823
and a search link 824. Adjacent to the principal item 711 and at one end of a selected
string of related items 822 is a first sequence element 825, which provides a link to
the affinity chart of more strongly related items, and adjacent to a last item of the
selected string of related items 822 is a second sequence element 826 which provides
a link to a supplementary affinity chart for more remotely relevant strings of related
items 822. By selecting the first sequence element 825, the user can navigate to reach
the affinity curve 710 to view a string of relevant items as shown in Fig. 7.

The methods and apparatuses of the present invention provide a visualization
technique that allows a set of related data items to be represented with respect to their
relationship to a principal data item. The set of related data items may be in any of
several forms and the display of the related items eliminates duplication and
information overload for the user. The related items may be displayed in many ways,
including along curved segments or in list form, so as to allow convenient
visualization of the data and the data's relationship to the principal item.

While the invention has been described in detail, including references to
specific embodiments, it will be apparent to one skilled in the art that changes and
modifications can be made to the invention without departing from the spirit and
scope thereof. For example, while a particular embodiment related to musical artists
has been disclosed, the invention can be applied equally well to other data types.
Thus, it is intended that the present invention cover the modifications and variations
of this invention provided they come within the scope of the appended claims and
their equivalents.